Full text available: pdf(470.55 KB) 



The availability of metadata annotations over media content such as photos is known to 
enhance retrieval and organization, particularly for large data sets. The greatest challenge 
for obtaining annotations remains getting users to perform the large amount of tedious 
manual work that is required. In this paper we introduce an approach for semi-automated 
labeling based on extraction of metadata from naturally occurring conversations of groups 
of people discussing pictures among themselves. As the bu ... 



Keywords: automatic label extraction, collaborative interaction, intelligent interfaces, 
multimodal processing, photo annotation 



IR theory: Table extraction using conditional random fields 

David Pinto, Andrew McCallum, Xing Wei, W. Bruce Croft 

July 2003 Proceedings of the 26th annual international ACM SIGIR conference on 
Research and development in information retrieval SIGIR '03 
Full text available: pdf(200.97 KB) Additional Information: full citation, abstract, references, citings, index terms 
The ability to find tables and extract information from them is a necessary component of 
data mining, question answering, and other information retrieval tasks. Documents often 
contain tables in order to communicate densely packed, multi-dimensional information. 
Tables do this by employing layout patterns to efficiently indicate fields and records in 
two-dimensional form. Their rich combination of formatting and content presents difficulties 
for traditional language modeling techniques, however. T ... 
Ferret: a toolkit for content-based similarity search of feature-rich data 
Qin Lv, William Josephson, Zhe Wang, Moses Charikar, Kai Li 

April 2006 ACM SIGOPS Operating Systems Review, Proceedings of the 2006 
EuroSys conference EuroSys '06, Volume 40 issue 4 
Publisher: ACM Press 

Full text available: pdf(2.04 MB) Additional Information: full citation, abstract, references, index terms 

Building content-based search tools for feature-rich data has been a challenging problem 
because feature-rich data such as audio recordings, digital images, and sensor data are 
inherently noisy and high dimensional. Comparing noisy data requires comparisons based 
on similarity instead of exact matches, and thus searching for noisy data requires 
similarity search instead of exact search. The Ferret toolkit is designed to help system 
builders quickly construct content-based similarity search system ... 
Algorithms and theory: A methodology for semantic integration of metadata in 
bioinformatics data sources 

Lei Li, Roop G. Singh, Guangzhi Zheng, Art Vandenberg, Vijay Vaishnavi, Sham Navathe 
March 2005 Proceedings of the 43rd annual Southeast regional conference - Volume 1 
ACM-SE 43 
Publisher: ACM Press 

Full text available: pdf(346.06 KB) Additional Information: full citation, abstract, references, index terms 

Semantic heterogeneity is becoming increasingly prominent in bioinformatics domains 
that deal with constantly expanding, dynamic, often very large, datasets from various 
distributed sources. Metadata is the key component for effective information integration. 
Traditional approaches for reconciling semantic heterogeneity use standards or 
mediation-based methods. These approaches have had limited success in addressing the general 
semantic heterogeneity problem and by themselves are not likely to s ... 